Converting Romanized Persian to the Arabic Writing System
Abstract
This paper describes a syllabification-based method for converting romanized Persian text to the traditional Arabic-based writing system. The system is implemented in Xerox XFST and relies on rule-based conversion of words rather than morphological analysis. The paper presents a brief evaluation of the accuracy of the transcriptions generated by the method.
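To make the idea of syllabification-based, rule-based conversion concrete, here is a minimal Python sketch. It is not the paper's XFST rule set: the romanization scheme (one character per sound, with `â` for the long a), the `(C)V(C)(C)` syllable template, and the function names `syllabify` and `to_persian` are all assumptions chosen for illustration. The sketch omits short vowels inside a word, writes long vowels, and adds an alef carrier for word-initial vowels. It ignores the many-to-one consonant ambiguities of the real orthography (for example, the several letters that sound like `z`).

```python
# Illustrative sketch only; the mappings and rules below are assumptions,
# not the rules used in the paper's XFST implementation.
VOWELS = set("aeoâiu")
LONG = {"â": "ا", "i": "ی", "u": "و"}  # long vowels are always written
CONS = {
    "b": "ب", "p": "پ", "t": "ت", "j": "ج", "x": "خ", "d": "د",
    "r": "ر", "z": "ز", "s": "س", "f": "ف", "q": "ق", "k": "ک",
    "g": "گ", "l": "ل", "m": "م", "n": "ن", "v": "و", "h": "ه", "y": "ی",
}


def syllabify(word):
    """Split a romanized word into (C)V(C)(C) syllables.

    Every vowel is a nucleus, and each non-initial nucleus takes at most
    one consonant as its onset; leftover consonants stay in the previous coda.
    """
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not nuclei:
        return [word]
    starts = [0]
    for n in nuclei[1:]:
        # The syllable starts at the consonant just before the vowel, if any.
        starts.append(n - 1 if word[n - 1] not in VOWELS else n)
    ends = starts[1:] + [len(word)]
    return [word[a:b] for a, b in zip(starts, ends)]


def to_persian(word):
    """Transcribe syllable by syllable with position-dependent vowel rules."""
    syllables = syllabify(word)
    out = []
    for si, syl in enumerate(syllables):
        for ci, ch in enumerate(syl):
            word_initial = si == 0 and ci == 0
            word_final = si == len(syllables) - 1 and ci == len(syl) - 1
            if ch in CONS:
                out.append(CONS[ch])
            elif ch in LONG:
                if word_initial:
                    # Initial long vowels need an alef carrier (or alef madda).
                    out.append("آ" if ch == "â" else "ا" + LONG[ch])
                else:
                    out.append(LONG[ch])
            elif ch in VOWELS:  # short vowel
                if word_initial:
                    out.append("ا")
                elif word_final:
                    out.append("و" if ch == "o" else "ه")
                # Word-internal short vowels are left unwritten.
    return "".join(out)


print(syllabify("madrese"))  # ['mad', 're', 'se']
print(to_persian("ketâb"))
```

In this toy version the vowel rules depend only on word position, so the syllable split mainly structures the traversal. A fuller rule set would also condition on syllable position, for instance to handle vowel hiatus, which this sketch does not attempt.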
Similar resources
Applying Finite State Morphology to Conversion Between Roman and Perso-Arabic Writing Systems
This paper presents a method for converting back and forth between the Perso-Arabic and Romanized writing systems for Persian. Given a word in one writing system, we use finite state transducers to generate a morphological analysis of the word, which is then used to regenerate the word's orthography in the other writing system. The system has been implemented in XFST and LEXC.
Tajik-Farsi Persian Transliteration Using Statistical Machine Translation
Tajik Persian is a dialect of Persian spoken primarily in Tajikistan and written with a modified Cyrillic alphabet. Iranian Persian, or Farsi, as it is natively called, is the lingua franca of Iran and is written with the Persian alphabet, a modified Arabic script. Although the spoken versions of Tajik and Farsi are mutually intelligible to educated speakers of both languages, the difference be...
Automatic Transliteration of Romanized Dialectal Arabic
In this paper, we address the problem of converting Dialectal Arabic (DA) text that is written in the Latin script (called Arabizi) into Arabic script following the CODA convention for DA orthography. The presented system uses a finite state transducer trained at the character level to generate all possible transliterations for the input Arabizi words. We then filter the generated list using a ...
Syllable Based Transcription of English Words into Perso-Arabic Writing System
This paper presents a rule-based method for transcribing English words into the Perso-Arabic orthography. The method relies on a phonetic representation of English words, such as that given by the CMU pronunciation dictionary. Some of the challenging problems are the context-based vowel representation in the Perso-Arabic writing system and the mismatch between the syllabic structures of English and Persi...